Improve FrozenDictionary.ToFrozenDictionary for Dictionary sources with different comparers
#120795
Conversation
The snippet under review:

```csharp
}
else if (newDictionary.Count != 0 && !newDictionary.Comparer.Equals(comparer))
{
    var dictionary = new Dictionary<TKey, TValue>(newDictionary.Count, comparer);
```
This could end up significantly over-allocating, no? Let's say the original uses an Ordinal comparer, the new one uses an OrdinalIgnoreCase comparer, and the original had every string entry duplicated many times just with different casing. The resulting dictionary's count is going to be much smaller than the original's, yet this will cause it to allocate much more memory?
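To illustrate the concern, a toy sketch (not the PR's code; the keys and counts are invented for this example):

```csharp
using System;
using System.Collections.Generic;

// Source keyed case-sensitively: entries differ only by casing.
var source = new Dictionary<string, int>(StringComparer.Ordinal)
{
    ["value"] = 1,
    ["Value"] = 2,
    ["VALUE"] = 3,
    ["vAlUe"] = 4,
};

// Re-keying under a case-insensitive comparer collapses the duplicates,
// so the final count is far smaller than source.Count.
var target = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
foreach (KeyValuePair<string, int> kvp in source)
{
    target[kvp.Key] = kvp.Value; // later casings overwrite earlier ones
}

Console.WriteLine(source.Count); // 4
Console.WriteLine(target.Count); // 1

// Pre-sizing `target` with source.Count would reserve 4x the buckets
// actually needed in this toy case.
```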
@stephentoub You're correct. Pre-allocating the dictionary with the original size may result in over-allocating when duplicate keys reduce the required capacity. I have removed the initial capacity specification.
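In other words, the change being described amounts to something like this sketch (paraphrasing the diff above, not the exact committed code):

```csharp
// Before: capacity copied from the source, which over-reserves when the
// new comparer collapses duplicate keys.
//     var dictionary = new Dictionary<TKey, TValue>(newDictionary.Count, comparer);

// After: no initial capacity; the dictionary grows on demand.
var dictionary = new Dictionary<TKey, TValue>(comparer);
```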
Thanks. At that point, then, is the only difference this new code path adds iterating over a Dictionary<> in a strongly-typed way rather than via IEnumerable<KeyValuePair<>>? In this state, have you tried running the before/after benchmarks on .NET 10?
I've updated the benchmark results accordingly.
```
BenchmarkDotNet v0.14.1-nightly.20250107.205, Windows 11 (10.0.26100.6584)
11th Gen Intel Core i7-11700 2.50GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 10.0.100-rc.1.25451.107
  [Host]     : .NET 10.0.0 (10.0.25.45207), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-SADIXB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

OutlierMode=DontRemove  PowerPlanMode=00000000-0000-0000-0000-000000000000  Toolchain=CoreRun
IterationTime=250ms  MaxIterationCount=20  MemoryRandomization=True
MinIterationCount=15  WarmupCount=1
```
| Method | Size | Mean | Error | StdDev | Median | Min | Max | Code Size | Gen0 | Gen1 | Gen2 | Allocated |
|-------------------- |------ |--------------:|--------------:|--------------:|--------------:|--------------:|--------------:|----------:|---------:|---------:|---------:|----------:|
| Original_Dictionary | 1 | 50.77 ns | 5.017 ns | 5.778 ns | 48.29 ns | 45.60 ns | 68.21 ns | 2,243 B | 0.0409 | - | - | 344 B |
| PR_Dictionary | 1 | 36.30 ns | 1.882 ns | 2.167 ns | 35.54 ns | 34.84 ns | 42.11 ns | 1,892 B | 0.0354 | - | - | 296 B |
| Original_Dictionary | 10 | 270.57 ns | 17.663 ns | 20.341 ns | 264.05 ns | 250.11 ns | 317.83 ns | 3,179 B | 0.1185 | - | - | 992 B |
| PR_Dictionary | 10 | 173.65 ns | 3.315 ns | 3.256 ns | 173.26 ns | 169.91 ns | 182.74 ns | 2,634 B | 0.1124 | - | - | 944 B |
| Original_Dictionary | 100 | 2,308.68 ns | 171.190 ns | 197.143 ns | 2,239.27 ns | 2,146.19 ns | 2,915.64 ns | 3,194 B | 0.9931 | 0.0170 | - | 8328 B |
| PR_Dictionary | 100 | 1,371.95 ns | 31.095 ns | 35.809 ns | 1,362.03 ns | 1,348.04 ns | 1,513.92 ns | 2,646 B | 0.9881 | 0.0165 | - | 8280 B |
| Original_Dictionary | 1000 | 19,235.59 ns | 697.651 ns | 803.416 ns | 18,891.83 ns | 18,611.36 ns | 21,727.45 ns | 2,992 B | 9.6957 | 1.8646 | - | 81304 B |
| PR_Dictionary | 1000 | 13,123.78 ns | 558.138 ns | 642.753 ns | 12,917.55 ns | 12,450.91 ns | 14,648.06 ns | 2,981 B | 9.6845 | 1.3478 | - | 81256 B |
| Original_Dictionary | 10000 | 257,255.96 ns | 26,300.558 ns | 30,287.762 ns | 246,032.88 ns | 224,345.99 ns | 362,544.12 ns | 2,993 B | 124.0672 | 124.0672 | 124.0672 | 753284 B |
| PR_Dictionary | 10000 | 179,099.95 ns | 6,462.982 ns | 7,442.779 ns | 178,413.09 ns | 168,169.41 ns | 198,089.87 ns | 1,573 B | 124.3421 | 124.3421 | 124.3421 | 753236 B |
@stephentoub From the updated benchmark results, I see that the original version, which reserves the initial capacity, allocated far less than the new one that doesn't. Instead of letting the backing storage resize to 2x each time (which would also result in over-allocating), pre-allocating a capacity that may be a bit more than the actual need seems clearly preferable to me.
It's going to depend on how much the count is expected to differ from the original collection that's using a different comparer. The benchmark is using a comparer with identical semantics, so the ideal is to just use the exact same capacity. If, however and for example, the original collection was using Ordinal, the new collection was using OrdinalIgnoreCase, and the original collection contained 1000 words each with 10 different casing variations, then choosing the original count is going to result in 10x the required allocation, whereas the normal growth strategy will over-allocate by at most about 2x.
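To make the arithmetic concrete, a small illustrative sketch (the counts come from the hypothetical scenario above; the real Dictionary uses prime-sized tables, so doubling is only an approximation of its growth):

```csharp
using System;

// Hypothetical scenario: 1000 distinct words, each present in 10 casing
// variants under the Ordinal comparer.
int sourceCount = 10_000;  // entries in the source dictionary
int finalCount = 1_000;    // entries after re-keying with OrdinalIgnoreCase

// Pre-sizing with the source count reserves ~10x the needed capacity.
Console.WriteLine($"pre-sized: {(double)sourceCount / finalCount:0.#}x"); // 10x

// A growth strategy that roughly doubles capacity on each resize ends with
// a final capacity bounded by about 2x the real need.
int capacity = 4;
while (capacity < finalCount) capacity *= 2;
Console.WriteLine($"grown: {(double)capacity / finalCount:0.##}x"); // ~1.02x here, <= 2x
```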
I initially suggested pre-allocating, but after considering @stephentoub's concern about the worst-case scenario, I agree that not setting the initial capacity is the better approach.
Since the optimal capacity depends on the comparer's behavior, the dynamic growth strategy is more robust.
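Putting the thread's conclusions together, the new code path has roughly the following shape (a sketch for illustration; the method name and structure are assumptions, not the actual PR code):

```csharp
using System.Collections.Generic;

static class FastPathSketch
{
    public static Dictionary<TKey, TValue> Rekey<TKey, TValue>(
        Dictionary<TKey, TValue> source,
        IEqualityComparer<TKey>? comparer)
        where TKey : notnull
    {
        // No initial capacity: the new comparer may collapse keys, so the
        // dictionary is allowed to grow on demand (see the discussion above).
        var result = new Dictionary<TKey, TValue>(comparer);

        // Enumerating Dictionary<,> directly uses its struct enumerator,
        // avoiding the interface dispatch and enumerator allocation of the
        // IEnumerable<KeyValuePair<,>> path.
        foreach (KeyValuePair<TKey, TValue> kvp in source)
        {
            result[kvp.Key] = kvp.Value;
        }

        return result;
    }
}
```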
Tagging subscribers to this area: @dotnet/area-system-collections
Summary

This PR improves the performance of the `FrozenDictionary.ToFrozenDictionary` method.

Implementation Approach

When the source is already a `Dictionary<TKey, TValue>` and satisfies the condition `Dictionary.Count != 0 && !Dictionary.Comparer.Equals(comparer)`, the processing is changed as follows.

Benchmark